-
Notifications
You must be signed in to change notification settings - Fork 83
docs(presto-clp): Add instructions for connecting CLP's UI to Presto; Simplify some set up instructions. #1310
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
WalkthroughUpdated Presto documentation: moved CLP config to a package block ( Changes
Sequence Diagram(s)sequenceDiagram
autonumber
actor U as User
participant W as CLP Web UI
participant S as CLP Server
participant P as Presto
rect rgb(235,245,255)
Note over U,W: Option A — Query via CLP Web UI
U->>W: Submit query
W->>S: Forward query (reads presto.host/presto.port)
S->>P: Send query to Presto
P-->>S: Return results
S-->>W: Relay results
W-->>U: Display results
end
sequenceDiagram
autonumber
actor U as User
participant C as Presto CLI
participant P as Presto
rect rgb(245,255,235)
Note over U,C: Option B — Query via Presto CLI (run from tools/deployment/presto-clp or via Docker)
U->>C: Run CLI query (point to presto.host:presto.port)
C->>P: Execute query
P-->>C: Return results
C-->>U: Print results
end
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Pre-merge checks and finishing touches✅ Passed checks (3 passed)
✨ Finishing touches
🧪 Generate unit tests
Tip 👮 Agentic pre-merge checks are now available in preview!Pro plan users can now enable pre-merge checks in their settings to enforce checklists before merging PRs.
Please see the documentation for more information. Example: reviews:
pre_merge_checks:
custom_checks:
- name: "Undocumented Breaking Changes"
mode: "warning"
instructions: |
Pass/fail criteria: All breaking changes to public APIs, CLI flags, environment variables, configuration keys, database schemas, or HTTP/GraphQL endpoints must be documented in the "Breaking Change" section of the PR description and in CHANGELOG.md. Exclude purely internal or private changes (e.g., code not exported from package entry points or explicitly marked as internal).Please share your feedback with us on this Discord post. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 3
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (1)
docs/src/user-docs/guides-using-presto.md(2 hunks)
🧰 Additional context used
🧠 Learnings (2)
📓 Common learnings
Learnt from: davemarco
PR: y-scope/clp#1198
File: components/webui/server/src/plugins/app/Presto.ts:38-43
Timestamp: 2025-08-25T16:27:50.549Z
Learning: In the CLP webui Presto configuration, host and port are set via package settings (configurable), while user, catalog, and schema are set via environment variables (environment-specific). This mixed approach is intentional - settings are typically set by package and some values don't need to be package-configurable.
📚 Learning: 2025-08-25T16:27:50.549Z
Learnt from: davemarco
PR: y-scope/clp#1198
File: components/webui/server/src/plugins/app/Presto.ts:38-43
Timestamp: 2025-08-25T16:27:50.549Z
Learning: In the CLP webui Presto configuration, host and port are set via package settings (configurable), while user, catalog, and schema are set via environment variables (environment-specific). This mixed approach is intentional - settings are typically set by package and some values don't need to be package-configurable.
Applied to files:
docs/src/user-docs/guides-using-presto.md
🔇 Additional comments (2)
docs/src/user-docs/guides-using-presto.md (2)
36-44: Confirm storage_engine/query_engine example and explicitly tie to clp-json.Looks good. Please explicitly state that this example applies to the clp-json flavour and that storage_engine must be clp-s for this guide, to avoid users mixing configs.
Apply this small doc tweak?
-2. Before starting the package, update the config (`etc/clp-config.yml`) as follows: +2. Before starting the package, update the config (`etc/clp-config.yml`) as follows (for the `clp-json` package): @@ package: storage_engine: "clp-s" query_engine: "presto"
74-81: Document PRESTO env vars and package settings.
Update the docs to replace placeholders with the actual keys: host/port come from package settings (settings.json: PrestoHost, PrestoPort) and the environment-provided values are PRESTO_CATALOG, PRESTO_SCHEMA and USER (see components/webui/server/src/plugins/app/Presto.ts).
| * Set the `database.host` key to a non-localhost hostname/IP. | ||
|
|
||
| ```yaml | ||
| database: | ||
| # type: "mariadb" | ||
| host: <IP-address> | ||
| # port: 3306 | ||
| # name: "clp-db" | ||
| ``` | ||
|
|
||
| :::{note} | ||
| This change is necessary since the Presto containers run on a Docker network, whereas CLP's database runs on the host network. So localhost refers to two different entities. | ||
| ::: | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧹 Nitpick
Recommend host.docker.internal over hard-coding host IP; clarify Linux note.
Hard-coding an IP is brittle. Suggest preferring host.docker.internal (works on macOS/Windows and modern Docker on Linux) and add a Linux fallback.
Proposed doc change:
- * Set the `database.host` key to a non-localhost hostname/IP.
+ * Set the `database.host` to `host.docker.internal` (recommended). If that doesn't resolve on your Linux distro, use your host's LAN IP or the Docker gateway IP.
@@
- database:
- # type: "mariadb"
- host: <IP-address>
+ database:
+ # type: "mariadb"
+ host: host.docker.internal # or your host IP / Docker gateway IP on Linux
# port: 3306
# name: "clp-db"
@@
- This change is necessary since the Presto containers run on a Docker network, whereas CLP's database runs on the host network. So localhost refers to two different entities.
+ Presto runs inside Docker, while CLP's DB runs on the host. `localhost` would point to different networks. `host.docker.internal` bridges the two.📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| * Set the `database.host` key to a non-localhost hostname/IP. | |
| ```yaml | |
| database: | |
| # type: "mariadb" | |
| host: <IP-address> | |
| # port: 3306 | |
| # name: "clp-db" | |
| ``` | |
| :::{note} | |
| This change is necessary since the Presto containers run on a Docker network, whereas CLP's database runs on the host network. So localhost refers to two different entities. | |
| ::: | |
| * Set the `database.host` to `host.docker.internal` (recommended). If that doesn't resolve on your Linux distro, use your host's LAN IP or the Docker gateway IP. |
🤖 Prompt for AI Agents
In docs/src/user-docs/guides-using-presto.md around lines 46 to 59, the
instructions currently suggest hard-coding the host IP for database.host; update
the guidance to recommend using host.docker.internal instead of a fixed IP
(which is supported on macOS/Windows and on modern Docker for Linux), and add a
brief Linux fallback note explaining how to enable host.docker.internal or use
the host network or the host's IP (with a pointer to Docker/OS-specific steps)
so readers have a resilient alternative rather than brittle hard-coded
addresses.
| * Set the `results_cache.retention_period` key to `null`. The CLP presto integration does not | ||
| yet provide support for garbage collection. | ||
|
|
||
| ```yaml | ||
| results_cache: | ||
| # host: "localhost" | ||
| # port: 27017 | ||
| # db_name: "clp-query-results" | ||
| # stream_collection_name: "stream-files" | ||
| # | ||
| # # Retention for search results, in minutes. Set to null to disable automatic deletion. | ||
| retention_period: null | ||
| ``` | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧹 Nitpick
Call out unbounded cache growth and offer manual cleanup guidance.
Setting retention_period: null disables GC and can cause unbounded growth. Add an explicit warning and a pointer to manual cleanup steps.
Suggested addition:
results_cache:
@@
- # # Retention for search results, in minutes. Set to null to disable automatic deletion.
+ # # Retention for search results, in minutes. Set to null to disable automatic deletion.
retention_period: null
```
+
+ :::warning
+ With `retention_period: null`, the results cache will grow without bounds. Plan periodic manual cleanup or set a finite retention once GC is supported.
+ :::📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| * Set the `results_cache.retention_period` key to `null`. The CLP presto integration does not | |
| yet provide support for garbage collection. | |
| ```yaml | |
| results_cache: | |
| # host: "localhost" | |
| # port: 27017 | |
| # db_name: "clp-query-results" | |
| # stream_collection_name: "stream-files" | |
| # | |
| # # Retention for search results, in minutes. Set to null to disable automatic deletion. | |
| retention_period: null | |
| ``` | |
| * Set the `results_cache.retention_period` key to `null`. The CLP presto integration does not | |
| yet provide support for garbage collection. |
🤖 Prompt for AI Agents
docs/src/user-docs/guides-using-presto.md around lines 60 to 73: the docs
currently state that results_cache.retention_period can be set to null but do
not warn about unbounded cache growth; add a clear warning block right after the
retention_period example that states that setting retention_period: null
disables GC and can cause unbounded growth, and include a short recommendation
to plan periodic manual cleanup or set a finite retention once GC is supported,
plus a pointer or link to the manual cleanup steps or housekeeping guide
elsewhere in the docs.
| To query your logs through Presto, you can use either: | ||
|
|
||
| * The WebUI available at [http://localhost:4000](http://localhost:4000) (if you changed | ||
| `webui.host` or `webui.port` in `etc/clp-config.yml`, use the new values) | ||
| * The Presto CLI: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧹 Nitpick
Disambiguate CLP Web UI vs Presto Web UI; add link to Presto UI (port verification).
Bullet reads as “WebUI” but it’s CLP’s web UI. Since this PR adds “setting up presto webui”, include the Presto coordinator UI endpoint too and confirm it matches the configured port.
Suggested wording:
-To query your logs through Presto, you can use either:
+To query your logs through Presto, you can use either:
@@
-* The WebUI available at [http://localhost:4000](http://localhost:4000) (if you changed
-`webui.host` or `webui.port` in `etc/clp-config.yml`, use the new values)
+* The CLP Web UI at [http://localhost:4000](http://localhost:4000) (if you changed
+ `webui.host` or `webui.port` in `etc/clp-config.yml`, use the new values)
+* The Presto Web UI at `http://<presto.host>:<presto.port>/` (e.g., `http://localhost:8889/`) to inspect queries and cluster state
* The Presto CLI:📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| To query your logs through Presto, you can use either: | |
| * The WebUI available at [http://localhost:4000](http://localhost:4000) (if you changed | |
| `webui.host` or `webui.port` in `etc/clp-config.yml`, use the new values) | |
| * The Presto CLI: | |
| To query your logs through Presto, you can use either: | |
| * The CLP Web UI at [http://localhost:4000](http://localhost:4000) (if you changed | |
| `webui.host` or `webui.port` in `etc/clp-config.yml`, use the new values) | |
| * The Presto Web UI at `http://<presto.host>:<presto.port>/` (e.g., `http://localhost:8889/`) to inspect queries and cluster state | |
| * The Presto CLI: |
quinntaylormitchell
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
coming along! only a few comments to address.
| To query your logs through Presto, you can use either: | ||
|
|
||
| * The WebUI available at [http://localhost:4000](http://localhost:4000) (if you changed | ||
| `webui.host` or `webui.port` in `etc/clp-config.yml`, use the new values) | ||
| * The Presto CLI: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| To query your logs through Presto, you can use either: | |
| * The WebUI available at [http://localhost:4000](http://localhost:4000) (if you changed | |
| `webui.host` or `webui.port` in `etc/clp-config.yml`, use the new values) | |
| * The Presto CLI: | |
| To query your logs through Presto, you can use either the [webUI](#querying-in-the-webui) or the | |
| [Presto CLI](#querying-from-the-presto-cli). | |
| ### Querying in the webUI | |
| The webUI is available at [http://localhost:4000](http://localhost:4000) (if you changed | |
| `webui.host` or `webui.port` in `etc/clp-config.yml`, use the new values). | |
| ### Querying from the Presto CLI | |
| To access the Presto CLI, navigate to the `tools/deployment/presto-clp` directory in a new terminal. | |
| Make sure Presto is still running in whichever terminal you originally started it from. Enter the | |
| following command to start the CLI: |
I think that this warrants being split into subsections, as it's awkward to go from the end of this list (the Presto CLI point) into an un-indented explanation of the Presto CLI.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i thought about this, but there is text underneath about certain SQL commands that apply to both. And putting the two headers makes it look like it only applies to the CLI
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ah very true, my bad! I do think that could be clearer; I will come up with a format that makes that clearer, and I'll include it with my next review
Co-authored-by: Quinn Taylor Mitchell <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 4
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
docs/src/user-docs/guides-using-presto.md (1)
157-165: Use docker compose down for full cleanup (consider -v to remove volumes).
docker compose rmremoves stopped containers but leaves networks/volumes. For a clean slate, preferdownand optionally-v.-To clean up the Presto cluster entirely: +To clean up the Presto cluster entirely: @@ -docker compose rm +docker compose down -v
♻️ Duplicate comments (4)
docs/src/user-docs/guides-using-presto.md (4)
46-55: Prefer host.docker.internal over hard‑coding a host IP (add Linux fallback).Hard‑coding an IP is brittle across networks. Recommend
host.docker.internaland document a Linux fallback.- * Set the `database.host` key to a non-localhost hostname/IP. + * Set the `database.host` to `host.docker.internal` (recommended). If that doesn't resolve on your Linux distro, + use your host's LAN IP or the Docker gateway IP. @@ - host: <IP-address> + host: host.docker.internal # or your host IP / Docker gateway IP on Linux
62-74: Warn that retention_period: null disables GC and can grow unbounded.Call this out explicitly to prevent surprises and suggest manual cleanup.
retention_period: null ``` + + :::warning + With `retention_period: null`, the results cache can grow without bounds. Plan periodic manual cleanup + or set a finite retention once GC is supported. + :::
169-169: Disambiguate “UI” to “CLP Web UI” and add Presto Web UI link.Avoid ambiguity and surface the coordinator UI for query/cluster inspection.
-You can query your compressed logs from CLP’s [UI](#querying-from-the-ui) or the [Presto CLI](#querying-from-the-presto-cli). +You can query your compressed logs from the [CLP Web UI](#querying-from-the-ui) or the +[Presto CLI](#querying-from-the-presto-cli). You can also inspect queries in the Presto Web UI at +`http://<presto.host>:<presto.port>/` (e.g., `http://localhost:8889/` if following this guide).
203-216: Consistency: say “CLP Web UI” in headings and links.Use “Web UI” consistently to match earlier phrasing and reduce ambiguity.
-### Querying from the UI +### Querying from the CLP Web UI
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (1)
docs/src/user-docs/guides-using-presto.md(3 hunks)
🧰 Additional context used
🧠 Learnings (2)
📓 Common learnings
Learnt from: davemarco
PR: y-scope/clp#1198
File: components/webui/server/src/plugins/app/Presto.ts:38-43
Timestamp: 2025-08-25T16:27:50.549Z
Learning: In the CLP webui Presto configuration, host and port are set via package settings (configurable), while user, catalog, and schema are set via environment variables (environment-specific). This mixed approach is intentional - settings are typically set by package and some values don't need to be package-configurable.
📚 Learning: 2025-08-25T16:27:50.549Z
Learnt from: davemarco
PR: y-scope/clp#1198
File: components/webui/server/src/plugins/app/Presto.ts:38-43
Timestamp: 2025-08-25T16:27:50.549Z
Learning: In the CLP webui Presto configuration, host and port are set via package settings (configurable), while user, catalog, and schema are set via environment variables (environment-specific). This mixed approach is intentional - settings are typically set by package and some values don't need to be package-configurable.
Applied to files:
docs/src/user-docs/guides-using-presto.md
🪛 markdownlint-cli2 (0.17.2)
docs/src/user-docs/guides-using-presto.md
203-203: Headings should be surrounded by blank lines
Expected: 1; Actual: 0; Below
(MD022, blanks-around-headings)
207-207: Headings should be surrounded by blank lines
Expected: 1; Actual: 0; Below
(MD022, blanks-around-headings)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: lint-check (macos-15)
quinntaylormitchell
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
only a few comments.
I like the new layout of the "Querying your logs" section, I think that works well.
question: are we going for "webUI" or "UI"? my vote is for "webUI" as it more clearly indicates that it's accessed through a web browser. if you want to stick with "UI", you can ignore my related comments.
also: when I used the Presto webUI, I noticed that queries with semicolons (as described/suggested in the "Querying your logs through Presto" section) didn't work. Removing the semicolon makes them work. is that a known bug? and if so, will it be fixed by the time these docs are merged? if it's not fixed, we should give a warning in these docs.
Co-authored-by: Quinn Taylor Mitchell <[email protected]>
Yes, the webui can only take one query at a time, when the cli can accept multiple. Since there is only one query there is no semi colon required. I dont know if bug or not, but maybe can add something that strips the semicolon. For now, i added a note in the webui. If u want i could instead strip the semicolon from the examples |
quinntaylormitchell
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm!
kirkrodrigues
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I haven't tested yet, so I might have some more comments after.
| # stream_collection_name: "stream-files" | ||
| # | ||
| # # Retention for search results, in minutes. Set to null to disable automatic deletion. | ||
| retention_period: null |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| retention_period: null | |
| retention_period: null |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i changed the whole indenting for this block in another commit
Co-authored-by: kirkrodrigues <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 3
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (1)
docs/src/user-docs/guides-using-presto.md(3 hunks)
🧰 Additional context used
🧠 Learnings (2)
📓 Common learnings
Learnt from: davemarco
PR: y-scope/clp#1198
File: components/webui/server/src/plugins/app/Presto.ts:38-43
Timestamp: 2025-08-25T16:27:50.549Z
Learning: In the CLP webui Presto configuration, host and port are set via package settings (configurable), while user, catalog, and schema are set via environment variables (environment-specific). This mixed approach is intentional - settings are typically set by package and some values don't need to be package-configurable.
📚 Learning: 2025-08-25T16:27:50.549Z
Learnt from: davemarco
PR: y-scope/clp#1198
File: components/webui/server/src/plugins/app/Presto.ts:38-43
Timestamp: 2025-08-25T16:27:50.549Z
Learning: In the CLP webui Presto configuration, host and port are set via package settings (configurable), while user, catalog, and schema are set via environment variables (environment-specific). This mixed approach is intentional - settings are typically set by package and some values don't need to be package-configurable.
Applied to files:
docs/src/user-docs/guides-using-presto.md
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: lint-check (ubuntu-24.04)
🔇 Additional comments (8)
docs/src/user-docs/guides-using-presto.md (8)
41-44: LGTM: package query_engine configThe example sets query_engine to "presto" alongside storage_engine; matches the integration flow.
46-55: Prefer host.docker.internal over hard-coding host IPRecommend defaulting to host.docker.internal with a Linux fallback to avoid brittle, environment-specific IPs.
- * Set the `database.host` key to a non-localhost hostname/IP. + * Set the `database.host` to `host.docker.internal` (recommended). If that doesn't resolve on + your Linux distro, use your host's LAN IP or the Docker gateway IP. @@ - database: - # type: "mariadb" - host: "<non-local-ip-address>" + database: + # type: "mariadb" + host: host.docker.internal # or your host IP / Docker gateway IP on Linux # port: 3306 # name: "clp-db"
56-61: Tighten the networking note; call out host.docker.internalClarify the split networks and mention the recommended bridge hostname explicitly.
- This change is necessary because the Presto containers run on a Docker network, and CLP's - database runs on the host network. `localhost` will refer to a different entity in each of - those contexts. This limitation will be addressed in the future when we unify Presto and CLP's - deployment infrastructure. + Presto runs inside Docker while CLP's database runs on the host. `localhost` points to + different networks in each context. `host.docker.internal` bridges Docker → host on + macOS/Windows and modern Linux; otherwise use your host's IP or the Docker gateway IP.
66-75: Warn about unbounded cache growth when retention_period: nullAdd an explicit warning so users plan housekeeping.
```yaml results_cache: @@ retention_period: null ``` + + :::warning + With `results_cache.retention_period: null`, the results cache can grow without bounds. + Plan periodic manual cleanup, or set a finite retention once GC is supported. + :::
77-85: Make YAML match stated defaults (localhost:8889)Align the example with the paragraph above to reduce ambiguity.
```yaml presto: - host: "<ip-address>" - port: <port> + host: localhost + port: 8889 ```
87-89: Document config split: host/port in config; user/catalog/schema via envState the intentional split so users know where to set which values.
:::{note} Presto doesn't need to be running before you start CLP. ::: + + :::{note} + CLP sets Presto host/port in package config, while user, catalog, and schema are provided via + environment variables. This split keeps package defaults stable and environment‑specific + values outside the package config. + :::
174-176: Disambiguate CLP UI vs Presto Web UI; add Presto UI linkCall the CLP UI out explicitly and include the Presto coordinator UI for inspection/monitoring.
-You can query your compressed logs in your browser from [CLP's UI](#querying-from-clps-ui), or -from the command line using the [Presto CLI](#querying-from-the-presto-cli). +You can query your compressed logs in your browser from [CLP's UI](#querying-from-clps-ui), or +from the command line using the [Presto CLI](#querying-from-the-presto-cli). You can also inspect +queries and cluster state in the Presto Web UI at `http://<presto.host>:<presto.port>/` +(e.g., `http://localhost:8889/`).
77-85: Confirmed — docker‑compose exposes Presto coordinator UI on host port 8889tools/deployment/presto-clp/docker-compose.yaml maps host 8889 -> container 8080 for presto-coordinator, so docs referencing localhost:8889 are correct (docs/src/user-docs/guides-using-presto.md: lines 77–85, 209–216, 218–229).
davemarco
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
kirk comments
| # stream_collection_name: "stream-files" | ||
| # | ||
| # # Retention for search results, in minutes. Set to null to disable automatic deletion. | ||
| retention_period: null |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i changed the whole indenting for this block in another commit
Co-authored-by: kirkrodrigues <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
docs/src/user-docs/guides-using-presto.md (1)
148-161: Add Presto Web UI endpointSurface where to view cluster state and queries; aligns with this PR’s objective.
```bash docker compose up ``` * To use more than one Presto worker, you can use the `--scale` option as follows: @@ * Replace `<num-workers>` with the number of Presto worker nodes you want to run. + + The Presto Web UI is available at `http://<presto.host>:<presto.port>/` + (e.g., `http://localhost:8889/`) to inspect queries and cluster state.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (1)
docs/src/user-docs/guides-using-presto.md(3 hunks)
🧰 Additional context used
🧠 Learnings (2)
📓 Common learnings
Learnt from: davemarco
PR: y-scope/clp#1198
File: components/webui/server/src/plugins/app/Presto.ts:38-43
Timestamp: 2025-08-25T16:27:50.549Z
Learning: In the CLP webui Presto configuration, host and port are set via package settings (configurable), while user, catalog, and schema are set via environment variables (environment-specific). This mixed approach is intentional - settings are typically set by package and some values don't need to be package-configurable.
📚 Learning: 2025-08-25T16:27:50.549Z
Learnt from: davemarco
PR: y-scope/clp#1198
File: components/webui/server/src/plugins/app/Presto.ts:38-43
Timestamp: 2025-08-25T16:27:50.549Z
Learning: In the CLP webui Presto configuration, host and port are set via package settings (configurable), while user, catalog, and schema are set via environment variables (environment-specific). This mixed approach is intentional - settings are typically set by package and some values don't need to be package-configurable.
Applied to files:
docs/src/user-docs/guides-using-presto.md
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
- GitHub Check: lint-check (macos-15)
- GitHub Check: lint-check (ubuntu-24.04)
🔇 Additional comments (3)
docs/src/user-docs/guides-using-presto.md (3)
209-216: Scope the semicolon limitation to CLP’s UI; clarify CLI unaffectedPrevents confusion when switching to the CLI.
-:::{note} -The UI can only run one query at a time, and queries must not end with a `;`. -::: +:::{note} +CLP’s UI can only run one query at a time, and queries must not end with a `;`. +This limitation does not apply to the Presto CLI. +:::
224-229: Note CLI supports semicolons and multi‑statementsSet expectations consistent with Presto CLI behaviour.
```bash docker compose exec presto-coordinator \ presto-cli \ --catalog clp \ --schema default
+The Presto CLI accepts semicolon‑terminated statements and supports multiple statements in one
+submission.--- `46-61`: **Prefer host.docker.internal over hard‑coding IP; add Linux fallback** Reduces brittleness and helps most users out‑of‑the‑box. Also tighten the note wording. ```diff - * Set the `database.host` key to a non-localhost hostname/IP. + * Set the `database.host` to `host.docker.internal` (recommended). If that doesn't resolve on + your Linux distro, use your host's LAN IP or the Docker gateway IP. ```yaml - database: - # type: "mariadb" - host: "<non-local-ip-address>" - # port: 3306 - # name: "clp-db" + database: + # type: "mariadb" + host: host.docker.internal # or your host IP / Docker gateway IP on Linux + # port: 3306 + # name: "clp-db" ``` - :::{note} - This change is necessary because the Presto containers run on a Docker network, and CLP's - database runs on the host network. `localhost` will refer to a different entity in each of - those contexts. This limitation will be addressed in the future when we unify Presto and CLP's - deployment infrastructure. - ::: + :::{note} + Presto runs inside Docker, while CLP's DB runs on the host. `localhost` would point to different + networks. `host.docker.internal` bridges the two. This limitation will be addressed when we unify + Presto and CLP's deployment infrastructure. + :::
Co-authored-by: kirkrodrigues <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (1)
docs/src/user-docs/guides-using-presto.md(4 hunks)
🧰 Additional context used
🧠 Learnings (2)
📓 Common learnings
Learnt from: davemarco
PR: y-scope/clp#1198
File: components/webui/server/src/plugins/app/Presto.ts:38-43
Timestamp: 2025-08-25T16:27:50.549Z
Learning: In the CLP webui Presto configuration, host and port are set via package settings (configurable), while user, catalog, and schema are set via environment variables (environment-specific). This mixed approach is intentional - settings are typically set by package and some values don't need to be package-configurable.
📚 Learning: 2025-08-25T16:27:50.549Z
Learnt from: davemarco
PR: y-scope/clp#1198
File: components/webui/server/src/plugins/app/Presto.ts:38-43
Timestamp: 2025-08-25T16:27:50.549Z
Learning: In the CLP webui Presto configuration, host and port are set via package settings (configurable), while user, catalog, and schema are set via environment variables (environment-specific). This mixed approach is intentional - settings are typically set by package and some values don't need to be package-configurable.
Applied to files:
docs/src/user-docs/guides-using-presto.md
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: antlr-code-committed (macos-15)
- GitHub Check: lint-check (macos-15)
- GitHub Check: lint-check (ubuntu-24.04)
🔇 Additional comments (8)
docs/src/user-docs/guides-using-presto.md (8)
46-55: Prefer host.docker.internal over hard‑coding host IP; add Linux fallback noteUsing a fixed IP is brittle. Recommend host.docker.internal with a brief Linux fallback.
- * Set the `database.host` key to a non-localhost hostname/IP. + * Set `database.host` to `host.docker.internal` (recommended). On Linux where it isn't available, + use your host's LAN IP or the Docker gateway IP. ```yaml database: # type: "mariadb" - host: "<non-local-ip-address>" + host: host.docker.internal # or your host IP / Docker gateway IP on Linux # port: 3306 # name: "clp-db" ```
56-61: Tighten and clarify the Docker networking noteMinor reword for flow and clarity.
- This change is necessary because the Presto containers run on a Docker network, and CLP's - database runs on the host network. `localhost` will refer to a different entity in each of - those contexts. This limitation will be addressed in the future when we unify Presto and CLP's - deployment infrastructure. + Presto runs inside Docker while CLP's database runs on the host. `localhost` points to different + networks in each context. This limitation will be addressed when we unify Presto and CLP's + deployment infrastructure.
63-76: Warn about unbounded cache growth when retention_period is nullWithout GC, the cache can grow until disk exhaustion.
```yaml results_cache: @@ retention_period: null ``` + + :::warning + With `retention_period: null`, the results cache can grow without bounds. Plan periodic manual + cleanup or set a finite retention once GC is supported. + :::
77-85: Make the Presto host/port example match the stated defaults (localhost:8889)Reduces ambiguity and aligns with the paragraph above.
```yaml presto: - host: "<ip-address>" - port: <port> + host: localhost + port: 8889 ```
87-89: Document config responsibility split: host/port in config; user/catalog/schema via envAligns with intentional design in the Web UI plugin.
:::{note} Presto doesn't need to be running before you start CLP. ::: + + :::{note} + Set Presto host/port in package config. Provide user, catalog, and schema via environment + variables. This keeps package defaults stable and environment‑specific values outside the config. + :::
178-180: Include Presto Web UI alongside CLP UI and CLISurface the coordinator UI endpoint and tie it to the configured port.
-You can query your compressed logs in your browser from [CLP's UI](#querying-from-clps-ui), or -from the command line using the [Presto CLI](#querying-from-the-presto-cli). +You can query your compressed logs in your browser from [CLP's UI](#querying-from-clps-ui), inspect +queries and cluster state in the Presto Web UI at `http://<presto.host>:<presto.port>/` +(e.g., `http://localhost:8889/`), or use the [Presto CLI](#querying-from-the-presto-cli).
218-220: Scope the semicolon limitation to CLP’s UI; clarify CLI behaviourAvoids confusion for CLI users.
-:::{note} -The UI can only run one query at a time, and queries must not end with a `;`. -::: +:::{note} +CLP’s UI can only run one query at a time, and queries entered there must not end with a `;`. +This limitation does not apply to the Presto CLI. +:::
228-233: Note that the Presto CLI accepts semicolons and multi‑statement submissionsSets accurate expectations.
presto-cli \ --catalog clp \ --schema default
+The Presto CLI accepts semicolon‑terminated statements and supports submitting multiple statements in one invocation.
</blockquote></details> </blockquote></details> </details> <!-- This is an auto-generated comment by CodeRabbit for review status -->
kirkrodrigues
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For the PR title, how about:
docs(presto-clp): Add instructions for connecting CLP's UI to Presto; Simplify some set up instructions.
|
|
||
| ```bash | ||
| docker compose up | ||
| docker compose up --detach |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it might be better to also add the --wait flag so users can be notified if any service fails to start successfully
@coderabbitai add an issue to track
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Create a GitHub issue to track the suggestion
gh issue create \
--repo y-scope/clp \
--title "docs(presto): Add --wait flag to docker compose up command in Presto documentation" \
--body "## Description
Add the \`--wait\` flag to the \`docker compose up --detach\` command in the Presto integration documentation so users can be notified if any service fails to start successfully.
## Current command
\`\`\`bash
docker compose up --detach
\`\`\`
## Suggested command
\`\`\`bash
docker compose up --detach --wait
\`\`\`
## Rationale
The \`--wait\` flag ensures that the command waits until services are running and healthy before returning. This provides better feedback to users if any service fails to start successfully.
## References
- PR: https://github.com/y-scope/clp/pull/1310
- Comment: https://github.com/y-scope/clp/pull/1310#discussion_r2461661299
- Requested by: @junhaoliao" \
--assignee junhaoliao
Length of output: 42
@junhaoliao, I've created issue #1495 to track adding the --wait flag to the docker compose up --detach command in the Presto documentation. This will help users get notified if any service fails to start successfully.
| * Set the `database.host` key to a non-localhost hostname/IP. | ||
|
|
||
| ```yaml | ||
| database: | ||
| # type: "mariadb" | ||
| host: "<non-local-ip-address>" | ||
| # port: 3306 | ||
| # name: "clp-db" | ||
| ``` | ||
|
|
||
| :::{note} | ||
| This change is necessary because the Presto containers run on a Docker network, and CLP's | ||
| database runs on the host network. `localhost` will refer to a different entity in each of | ||
| those contexts. This limitation will be addressed in the future when we unify Presto and CLP's | ||
| deployment infrastructure. | ||
| ::: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(i figure this docs PR should provide better context, although the code for the Presto Docker Compose project was originally added in #1132.)
The mitigation documented above is practical but may be inconvenient for users operating in diverse network environments.
In the long term, we plan to integrate the Presto services into the CLP Package Docker Compose project introduced in #1178. Running du -h tools/deployment/presto-clp shows the directory is only 132 KB, indicating that the integration would not significantly increase the package size while substantially improving user convenience.
@coderabbitai please file an issue to track this long-term integration plan.
Since implementing and reviewing this integration may take time, a short-term solution is needed before the next release. After #1178, the webui service may also encounter connectivity issues with the Presto connector, as host network mode was removed in that update.
A likely mitigation is to allow the Presto project to attach to an external Docker network defined in the CLP Package Compose project.
@coderabbitai please create a separate issue to track this short-term fix.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Get the repository owner and name from the current directory
REPO="y-scope/clp"
PR_NUMBER="1310"
COMMENT_URL="https://github.com/y-scope/clp/pull/1310#discussion_r2462666131"
# Create the long-term issue for full Presto integration
echo "Creating long-term issue for Presto integration..."
gh issue create \
--repo "$REPO" \
--title "Integrate Presto services into CLP Package Docker Compose project" \
--body "## Context
As documented in PR #1310, the current Presto integration requires users to set \`database.host\` to a non-localhost hostname/IP because the Presto containers run on a Docker network while CLP's database runs on the host network. This mitigation is practical but inconvenient for users operating in diverse network environments.
## Long-term Solution
Integrate the Presto services into the CLP Package Docker Compose project introduced in #1178. The \`tools/deployment/presto-clp\` directory is only 132 KB in size, so this integration would not significantly increase the package size while substantially improving user convenience.
## Benefits
- Eliminates the need for non-localhost hostname configuration
- Provides a unified deployment experience
- Simplifies network connectivity between CLP and Presto services
## References
- PR #1310: $COMMENT_URL
- Original Presto Docker Compose addition: #1132
- CLP Package Docker Compose project: #1178
- Requested by: @junhaoliao" \
--assignee junhaoliao
echo ""
echo "Creating short-term issue for external Docker network..."
# Create the short-term issue for external Docker network attachment
gh issue create \
--repo "$REPO" \
--title "Allow Presto project to attach to external Docker network for CLP Package Compose" \
--body "## Context
After #1178, the \`webui\` service may encounter connectivity issues with the Presto connector because host network mode was removed in that update. Since implementing and reviewing the full integration (tracked separately) may take time, a short-term solution is needed before the next release.
## Short-term Solution
Allow the Presto project to attach to an external Docker network defined in the CLP Package Compose project. This will enable proper connectivity between the webui service and Presto without requiring full integration.
## Implementation Notes
- This is a temporary mitigation until the full integration is completed
- Should be implemented before the next release
- Will resolve connectivity issues between webui and Presto services
## References
- PR #1310: $COMMENT_URL
- CLP Package Docker Compose project: #1178
- Requested by: @junhaoliao" \
--assignee junhaoliao
echo ""
echo "Both issues created successfully!"
Length of output: 229
@junhaoliao, I've created both issues as requested:
-
Long-term integration: #1497 - Integrate Presto services into CLP Package Docker Compose project
- Addresses the inconvenience of requiring non-localhost hostname configuration
- Proposes full integration of the 132 KB Presto project into the CLP Package Docker Compose
-
Short-term fix: #1498 - Allow Presto project to attach to external Docker network for CLP Package Compose
- Provides a temporary solution before the next release
- Resolves potential connectivity issues between the webui service and Presto connector after feat(deployment)!: Migrate package orchestration to Docker Compose (resolves #1177); Temporarily remove support for multi-node deployments. #1178
Both issues have been assigned to you and include references to PR #1310 and the relevant context.
🧠 Learnings used
Learnt from: davemarco
PR: y-scope/clp#1198
File: components/webui/server/src/plugins/app/Presto.ts:38-43
Timestamp: 2025-08-25T16:27:50.549Z
Learning: In the CLP webui Presto configuration, host and port are set via package settings (configurable), while user, catalog, and schema are set via environment variables (environment-specific). This mixed approach is intentional - settings are typically set by package and some values don't need to be package-configurable.
Description
Modifies presto docs to support starting the webui.
Checklist
breaking change.
Validation performed
Package starts with new webui using these docs
Summary by CodeRabbit